8367319: Add os interfaces to get machine and container values separately #27646

caspernorrbin · 2025-10-06T12:51:22Z

Hi everyone,

The current os:: layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases, this is fine: users only care about the returned value. But for other use cases, it matters where the value originated. Two examples:

A user might need the physical cpu count of the machine, but os::active_processor_count() only returns the limited container value, which also represents something slightly different.
A user might want the container memory limit and the physical RAM size, but os::physical_memory() only gives one number.

To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with machine_ and container_. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values.
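As a rough illustration of the bool return + out-parameter shape described above, here is a minimal, self-contained sketch. It is not the code from this PR: `FakeOs`, its fields, and the exact accessor names (`machine_physical_memory`, `container_memory_limit`) are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the os:: layer. All names and values here are
// illustrative assumptions, not the signatures introduced by the PR.
struct FakeOs {
  bool containerized;      // whether a container limit applies
  size_t host_memory;      // bytes reported by the operating system
  size_t container_limit;  // bytes, meaningful only when containerized

  // Machine variant: always available, writes the OS-reported value.
  bool machine_physical_memory(size_t& value) const {
    value = host_memory;
    return true;
  }

  // Container variant: fails (and leaves `value` untouched) outside a container.
  bool container_memory_limit(size_t& value) const {
    if (!containerized) {
      return false;
    }
    value = container_limit;
    return true;
  }

  // Original "mixed" accessor: container value if present, else machine value.
  size_t physical_memory() const {
    size_t value = 0;
    if (container_memory_limit(value)) {
      return value;
    }
    machine_physical_memory(value);
    return value;
  }
};
```

A caller that needs both numbers would call the two explicit variants and check each return value, while existing callers keep using the mixed accessor unchanged.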

In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment.

OSContainer::active_processor_count() has also been changed to return double instead of int. The previous implementation rounded the quota/period ratio up to produce an integer for os::active_processor_count(). Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in os::active_processor_count().
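The rounding described above can be sketched as follows. This is a simplified illustration, not the HotSpot implementation: the function names are hypothetical, and cases such as an unlimited quota or cpuset restrictions are deliberately ignored.

```cpp
#include <cassert>
#include <cmath>

// Fractional CPU count implied by a cgroup CPU quota: quota / period.
// Hypothetical helper; assumes both values are positive (no "unlimited" quota).
double fractional_cpu_count(long quota_us, long period_us) {
  assert(quota_us > 0 && period_us > 0);
  return static_cast<double>(quota_us) / static_cast<double>(period_us);
}

// Integer view of the same limit: round the fraction up so that, for example,
// a 1.5-CPU quota is reported as 2 processors.
int rounded_cpu_count(long quota_us, long period_us) {
  return static_cast<int>(std::ceil(fractional_cpu_count(quota_us, period_us)));
}
```

With a quota of 150000 and a period of 100000, the fractional value is 1.5 and the rounded value is 2, which mirrors the split between the exact container value and the integer returned by the mixed accessor.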

Testing:

Oracle tiers 1-5
Container tests on cgroup v1 and v2 hosts.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8367319: Add os interfaces to get machine and container values separately (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27646/head:pull/27646
$ git checkout pull/27646

Update a local copy of the PR:
$ git checkout pull/27646
$ git pull https://git.openjdk.org/jdk.git pull/27646/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27646

View PR using the GUI difftool:
$ git pr show -t 27646

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27646.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-10-06T12:52:35Z

👋 Welcome back cnorrbin! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-10-06T12:54:09Z

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk · 2025-10-06T12:54:54Z

@caspernorrbin The following label will be automatically applied to this pull request:

hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2025-10-06T12:58:20Z

Webrevs

01: Full - Incremental (e59ff7c4)
00: Full (18527102)

albertnetymk · 2025-10-06T14:23:51Z

For most use cases, this is fine: users only care about the returned value. But for other use cases, it matters where the value originated.

Then, is it possible to keep the original API for "most use cases"? For others, they can query is_containerized and new machine_ APIs.

The current proposed API kind of forces all users to distinguish btw inside a container or not, even though most use cases don't care.

caspernorrbin · 2025-10-06T14:36:49Z

Then, is it possible to keep the original API for "most use cases"? For others, they can query is_containerized and new machine_ APIs.

The current proposed API kind of forces all users to distinguish btw inside a container or not, even though most use cases don't care.

The original API is untouched, so "most use cases" can still use that without ever worrying about containers. The machine_ functions are simply an add-on to that API. Both the examples above remain unchanged: os::active_processor_count() still returns either the machine or the container value, and os::physical_memory() still returns the appropriate memory limit.

albertnetymk · 2025-10-06T15:50:52Z

The original API is untouched, ...

I see; I misunderstood the proposal. Then, there are 3 sets of public APIs, generic, machine_, and container_. I'd expect most use the generic ones and some might query the machine_ ones, so I don't see much need for container_ ones to be public APIs. YMMV.

jerboaa

This will conflict mightily with the refactoring that I'm working on for https://bugs.openjdk.org/browse/JDK-8365606

jerboaa · 2025-10-06T16:38:14Z

src/hotspot/share/runtime/os.cpp

+}
+
+double os::container_processor_count() {
+  assert(is_containerized(), "must be running containerized");


This is the equivalent of assert(false). Shouldn't this be ShouldNotReachHere()?

dholmes-ora · 2025-10-07T01:33:34Z

A user might need the physical cpu count of the machine, but os::active_processor_count() only returns the limited container value, which also represents something slightly different.

That is a mis-characterization of the API. active_processor_count() tells you how many logical processors are available to the JVM process. That can be very different to the "physical" (**) number of processors due to partitioning at various levels (e.g. virtualization, containerization), as well as direct restrictions through API's like taskset.

(**) "physical" actually has no meaning these days. There is some value you can obtain through the operating system that provides the maximum number of processors that the operating system can see (and thus make available to the JVM).

dholmes-ora · 2025-10-07T01:53:29Z

What is a "machine" here? Historically we have misused "physical" to mean what does a bare-metal OS report on a bare-metal piece of hardware. But that became inaccurate decades ago once virtualization/hypervisors arrived. So we've adjusted API's (e.g. MXBeans) to report whatever the "operating system" reports. The problem there is some things the operating system reports take into account the presence of containers, and others do not. This has always been a problem with these container environments - they should be invisible to software but they are not.

To support this, an os::is_containerized function should also be added.

For a long time this was an impossible question to answer accurately - we could query whether cgroups were configured on a system but we couldn't ask if the JVM process was running under any cgroup constraints - has that changed?

I would like to get a better idea of what kinds of "machine" information we need to query and how it will be used. I mean, how does it help to know a "machine" has 256 processors if the various software layers only make 16 available to you?

caspernorrbin · 2025-10-07T09:44:15Z

That is a mis-characterization of the API. active_processor_count() tells you how many logical processors are available to the JVM process. That can be very different to the "physical" (**) number of processors due to partitioning at various levels (e.g. virtualization, containerization), as well as direct restrictions through API's like taskset.

(**) "physical" actually has no meaning these days. There is some value you can obtain through the operating system that provides the maximum number of processors that the operating system can see (and thus make available to the JVM).

Agreed, I conflated the two here. What I actually should have written is like you said, the number of logical processors available for the JVM to execute on. That is also the value the new machine_active_processor_count() returns.

By contrast, the current container-reported value treats cpu quota and logical processors as the same thing, even though quota only restricts cpu time, not the number of cores we can run on. With a quota of 1, we might still execute on two cores for 50% of the time each, but os::active_processor_count() still only reports "1".

What is a "machine" here? Historically we have misused "physical" to mean what does a bare-metal OS report on a bare-metal piece of hardware. But that became inaccurate decades ago once virtualization/hypervisors arrived. So we've adjusted API's (e.g. MXBeans) to report whatever the "operating system" reports. The problem there is some things the operating system reports take into account the presence of containers, and others do not. This has always been a problem with these container environments - they should be invisible to software but they are not.

Machine in this context is the values the operating system reports, which could already be limited depending on the configuration. All this is of course Linux-only, as we don't support containers on any other platforms. In many container deployments the cgroup limits do differ from the OS view, but in a fully virtualized environment they can coincide. In that situation none of this makes a difference anyways, and both functions would report the same value.

For a long time this was an impossible question to answer accurately - we could query whether cgroups were configured on a system but we couldn't ask if the JVM process was running under any cgroup constraints - has that changed?

Our container detection still isn't perfect, but it has improved:

We first check whether all cgroup controllers are mounted read-only, which is the default for many container runtimes.
If not, we examine the JVM's cgroup path to see if there are any memory/cpu limits present (covers JVMs started in restricted systemd slices).
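The two-step check above could be modelled roughly like this. It is a sketch of the decision logic only: `ControllerInfo` and `looks_containerized` are hypothetical names, and how the flags are actually gathered from the cgroup filesystem is left out entirely.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one cgroup controller as seen by the JVM.
struct ControllerInfo {
  bool mounted_read_only;  // controller mount is read-only
  bool has_limit;          // a memory/cpu limit exists on the JVM's cgroup path
};

// Step 1: all controllers mounted read-only -> assume a container runtime.
// Step 2: otherwise, any memory/cpu limit on the JVM's cgroup path -> treat as
//         containerized (e.g. a restricted systemd slice).
bool looks_containerized(const std::vector<ControllerInfo>& controllers) {
  if (controllers.empty()) {
    return false;  // nothing to inspect, assume not containerized
  }
  bool all_read_only = true;
  bool any_limit = false;
  for (const ControllerInfo& c : controllers) {
    all_read_only = all_read_only && c.mounted_read_only;
    any_limit = any_limit || c.has_limit;
  }
  return all_read_only || any_limit;
}
```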

These heuristics can miss more exotic setups but are pretty accurate for most use cases today.

I would like to get a better idea of what kinds of "machine" information we need to query and how it will be used. I mean, how does it help to know a "machine" has 256 processors if the various software layers only make 16 available to you?

Like I mentioned above when explaining os::active_processor_count(), it can be very relevant to know that the underlying machine (virtualized or not) has 16 cpus available, even though the JVM's cgroup quota restricts us to an average of 2 cpus. In latency-sensitive workloads we might burst onto all 16 cores for a short interval and still stay within the 2-cpu quota. At the moment, the JVM has no way to know that those extra 14 cores even exist, so it cannot make that optimization.
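To make the burst arithmetic concrete, here is a small sketch under stated assumptions: a quota expressed as CPU time per period, a 100 ms period chosen for the example, and a hypothetical helper name `burst_window_us`. It only does the division; it does not model the scheduler.

```cpp
#include <algorithm>
#include <cassert>

// Per-period wall-clock time (in microseconds) for which `cores` cores can all
// run at once before a CPU quota is used up, capped at the period length.
// Hypothetical helper for illustration only.
double burst_window_us(double quota_us, double period_us, int cores) {
  assert(quota_us > 0 && period_us > 0 && cores > 0);
  return std::min(period_us, quota_us / cores);
}
```

With a 2-cpu quota (200000 µs of CPU time per 100000 µs period), 16 cores can run concurrently for 12500 µs of each period. With a 1-cpu quota, two cores can each run for 50000 µs, that is 50% of the period each.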

The same argument applies to memory. GC heuristics may want to look at overall memory pressure on the machine, not just the container's limit. Imagine several containers on one host, where one is consuming most of its limit while others are relatively idle. From the host perspective the system is still comfortable, so the GC could be less aggressive compared to if every container was close to its limit. Without access to the host-level numbers we can't distinguish these scenarios.

Admittedly these functions are niche and will only matter for very specialised, performance-critical tasks. Still, the information is already available from the operating system, and the JVM should not hide or overwrite it for those users who can benefit from it.

caspernorrbin · 2025-10-07T09:47:51Z

This will conflict mightily with the refactoring that I'm working on for https://bugs.openjdk.org/browse/JDK-8365606

Understood, thanks for flagging that. The amount of change in the container layer isn't that big except for CgroupUtil::processor_count(), which wasn't of a java type to begin with. Is there something I could change to lessen the conflict at all?

dholmes-ora · 2025-10-07T10:52:10Z

By contrast, the current container-reported value treats cpu quota and logical processors as the same thing, even though quota only restricts cpu time, not the number of cores we can run on. With a quota of 1, we might still execute on two cores for 50% of the time each, but os::active_processor_count() still only reports "1".

Right - the way the "container folk" dealt with quotas etc never made any sense to me at all. I tried arguing the point but to no avail. So basically what you are looking for here is a way to get around the "broken" definition of available-processors when quotas are enforced.

caspernorrbin · 2025-10-07T12:19:43Z

So basically what you are looking for here is a way to get around the "broken" definition of available-processors when quotas are enforced.

Partly, yes. The cpu quota is the clearest example, but the same mismatch shows up for other values. The meaning of the container values doesn't always map 1:1 to OS numbers. What I'm after is a clean way to get those numbers separately. Pairing machine_ and container_ gives access to both pieces of data; offering just the machine side would leave us without a direct path to the container value.

jerboaa · 2025-10-07T12:20:29Z

This will conflict mightily with the refactoring that I'm working on for https://bugs.openjdk.org/browse/JDK-8365606

Understood, thanks for flagging that. The amount of change in the container layer isn't that big except for CgroupUtil::processor_count(), which wasn't of a java type to begin with. Is there something I could change to lessen the conflict at all?

Separating out the processor_count() change would be good (irrespective of the conflict). But other than that, I'd think there isn't really a good way. Perhaps wait for JDK-8365606 to land, as that involves far more changes than what is done here, i.e. the merge should be simpler. I should have something ready by end of the week, fwiw.

separate-machine-container-functions

1852710

openjdk bot changed the title ~~8367319~~ 8367319: Add os interfaces to get machine and container values separately Oct 6, 2025

openjdk bot added the hotspot label Oct 6, 2025

openjdk bot added the rfr Pull request is ready for review label Oct 6, 2025

Fixed print type

e59ff7c
